A simple spin system is studied as an analog for proteins. We investigate how the 
introduction of randomness and frustration into the system affects the designability 
and stability of ground state configurations. We observe that the spin system exhibits 
protein-like behavior in the vicinity of the transition between ferromagnet and spin 
glass. Our results illuminate some guiding principles in protein evolution. 
The folding of a protein into a specific three-dimensional (3D) biologically active 
structure is now often described by the funnel concept. It is assumed that the 
energy landscape of a protein is rugged but has a sufficient overall slope towards the 
native structure. Folding occurs by multi-pathway kinetics, and the particulars 
of the folding funnel determine the transitions between the different thermodynamic 
states. While the funnel concept was originally derived from studies of minimalistic 
protein models, evidence for its validity was subsequently presented for real 
proteins. 

A funnel-like energy landscape guarantees thermodynamic stability and kinetic 
accessibility for the biologically active structure of proteins. Both are necessary 
conditions for proteins to perform their biological functions. Hence, the funnel concept 
suggests that the optimal state of a protein is one of minimal frustration, because 
a smoother energy landscape and a steeper slope lead to faster folding 
and greater stability. However, proteins are in general only marginally stable, 
and both stability and speed of folding can often be increased by protein engineering. 
Hence, it appears that the sequence of amino acids in a protein is in general 
not optimized for smoothness of its energy landscape. The question then arises 
why this is the case and why proteins are only marginally stable. In other words, what 
factors constrain the amount of frustration (and the ruggedness of the funnel landscape) 
in the evolution of proteins? 

When studying the above questions one encounters the problem that the amount 
of frustration is difficult to control in protein models. For this reason, we propose to 
use the frustrated 3D Ising model on a simple cubic lattice with periodic boundary 
conditions as an analog for proteins, and to study the above questions for this much 
simpler system, in which the frustration can be easily measured. Unlike in earlier 
work, we interpolate continuously between the ferromagnet and the spin glass 
by varying the density of antiferromagnetic bonds. Our choice of system is 
motivated by the observation that proteins are similar to spin glasses in that their 
energy landscape is characterized by a huge number of local minima separated by high 
energy barriers. On the other hand, the global funnel-like topology of protein 
energy landscapes, leading to a unique ground state, more closely resembles a ferromagnet. 
Hence, it seems that proteins show behavior between that of a ferromagnet and that of a 
spin glass. However, the limitations of the analogy between the frustrated Ising model 
and proteins should be kept in mind. Spin systems do not fold; only the process 
by which the system finds its ground state can be regarded as analogous to folding. 
We can only study how this process depends on the frustration in Ising models, and 
under which conditions there are similarities to proteins. 
Our model is described by the Hamiltonian 

$$H = -\sum_{\langle lm \rangle}^{3N} J_{lm} \sigma_l \sigma_m ,$$

where the sum goes over all $3N$ ($N$ is the number of spins) pairs $\langle lm \rangle$ of 
nearest neighbor spins $\sigma_l = \pm 1$. A certain number $M$ of randomly chosen bond 
variables $J_{lm}$ are set to $J_{lm} = -1$, while the remaining $3N - M$ bonds are assigned 
the value $J_{lm} = 1$. The ratio $R = M/3N$ is a measure of the randomness in our Ising 
system and leads to frustration in the system, which is defined as usual through 

$$F = \frac{1}{3N} \sum_{i} \delta(F_{\Box_i}, -1) \quad \text{with} \quad F_{\Box_i} = J_{12} J_{23} J_{34} J_{41} .$$

Here, $J_{12}, J_{23}, J_{34}, J_{41}$ are the four bond variables of the $i$-th elementary 
plaquette $\Box_i$ of the lattice, and the sum goes over all $3N$ elementary plaquettes. 
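
The two definitions above can be made concrete with a short numerical sketch. The storage layout (an array `J[d, x, y, z]` holding the bond from site (x, y, z) to its neighbor in the +d direction) and the function names are illustrative assumptions, not taken from the original simulation code. The frustration is normalized by the number of plaquettes, so that it is the fraction of frustrated plaquettes.

```python
import numpy as np

def random_bonds(L, M, rng):
    """Bonds J[d, x, y, z] = +/-1 on an L^3 periodic lattice with exactly M
    antiferromagnetic (-1) bonds placed at random (assumed layout)."""
    J = np.ones(3 * L**3, dtype=int)
    J[rng.choice(J.size, size=M, replace=False)] = -1
    return J.reshape(3, L, L, L)

def energy(J, s):
    """H = -sum over nearest-neighbor pairs of J_lm * s_l * s_m (periodic)."""
    return -sum(int(np.sum(J[d] * s * np.roll(s, -1, axis=d))) for d in range(3))

def frustration(J):
    """Fraction of the 3N elementary plaquettes whose bond product is -1."""
    n_sites = J[0].size
    n_frustrated = 0
    for a in range(3):
        for b in range(a + 1, 3):
            # Plaquette in the (a, b) plane anchored at site r:
            # J_a(r) * J_b(r + e_a) * J_a(r + e_b) * J_b(r)
            prod = (J[a] * np.roll(J[b], -1, axis=a)
                    * np.roll(J[a], -1, axis=b) * J[b])
            n_frustrated += int(np.sum(prod == -1))
    return n_frustrated / (3 * n_sites)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    J = random_bonds(L=4, M=44, rng=rng)          # R = 44/192, about 0.23
    s = rng.choice(np.array([-1, 1]), size=(4, 4, 4))
    print("R =", 44 / 192, "F =", frustration(J), "H =", energy(J, s))
```

Each bond belongs to four plaquettes, so flipping a single bond of the ferromagnet frustrates exactly four of the $3N$ plaquettes.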

Our simulations are done on a $4 \times 4 \times 4$ lattice, which is small enough that 
simulated annealing will find the ground state. An even smaller lattice may 
have allowed exhaustive enumeration, but would have introduced severe finite size 
effects. For a given value $F$ of frustration, 2000 realizations of bond variables 
$\{J_{lm}\}$ are generated at random. For each realization, $N_1$ simulated annealing runs 
are used to search for the global minimum. In each run we cool the system from 
temperature $T = 3$ to $T = 0.3$ in steps of $\Delta T = 0.1$, performing 40 Monte Carlo 
sweeps (one update for each spin) at each temperature. We define as the ground state 
$C_g$ of one realization the configuration with the lowest energy $E_g$ obtained in the 
$N_1$ runs. To ensure reasonable statistics, we require that this energy is found in at 
least $N_2$ simulated annealing runs. The total number $N_1$ of runs is adjusted 
accordingly, and the failure rate $N_F = (N_1 - N_2)/N_1$ defines an index for the 
difficulty of finding the global minimum. In the next step, we check the $N_2$ ground 
state configurations for rotational and translational symmetries, and identify in this 
way the number $N_g$ of distinct ground state configurations found for the given 
realization. For small values of $R$ we set $N_2 = 10{,}000$. As the system approaches 
the spin glass, $N_g$ increases rapidly. Therefore, if $N_g > 1000$, we repeat the 
simulation with $N_2 = 100{,}000$ to obtain more accurate values for $N_g$. 
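
The annealing protocol and the failure rate described above can be sketched as follows. The text does not state the update rule or the sweep order, so the single-spin Metropolis acceptance and the sequential sweep used here are assumptions, as are all function names and the `max_runs` safety cap.

```python
import numpy as np

def energy(J, s):
    """H = -sum_<lm> J_lm s_l s_m; J[d] couples each site to its +d neighbor."""
    return -sum(int(np.sum(J[d] * s * np.roll(s, -1, axis=d))) for d in range(3))

def local_field(J, s, pos):
    """Sum of J_lm * s_m over the six neighbors m of the spin at pos."""
    L = s.shape[0]
    h = 0
    for d in range(3):
        fwd = list(pos); fwd[d] = (pos[d] + 1) % L
        bwd = list(pos); bwd[d] = (pos[d] - 1) % L
        h += J[d][pos] * s[tuple(fwd)]           # bond pos -> pos + e_d
        h += J[d][tuple(bwd)] * s[tuple(bwd)]    # bond pos - e_d -> pos
    return int(h)

def anneal(J, rng, t_start=3.0, t_end=0.3, dt=0.1, sweeps=40):
    """One annealing run: cool from t_start to t_end in steps of dt,
    with `sweeps` sweeps (one update per spin) at each temperature."""
    L = J.shape[1]
    s = rng.choice(np.array([-1, 1]), size=(L, L, L))
    n_temps = int(round((t_start - t_end) / dt)) + 1
    for k in range(n_temps):
        T = t_start - k * dt
        for _ in range(sweeps):
            for pos in np.ndindex(L, L, L):
                dE = 2 * s[pos] * local_field(J, s, pos)   # cost of flipping
                if dE <= 0 or rng.random() < np.exp(-dE / T):
                    s[pos] = -s[pos]
    return energy(J, s), s

def find_ground_state(J, rng, n2, max_runs=100000, **anneal_kwargs):
    """Repeat annealing until the lowest energy seen so far has been hit n2
    times. Returns (E_g, configurations at E_g, failure rate (N1 - N2)/N1)."""
    best_e, hits, n1, configs = None, 0, 0, []
    while hits < n2 and n1 < max_runs:
        e, s = anneal(J, rng, **anneal_kwargs)
        n1 += 1
        if best_e is None or e < best_e:
            best_e, hits, configs = e, 1, [s]    # new minimum: restart count
        elif e == best_e:
            hits += 1
            configs.append(s)
    return best_e, configs, (n1 - hits) / n1
```

Runs that end above the lowest energy found are counted as failures, which matches the definition of $N_F$ in the text. The pure-Python loops are far too slow for $N_2 = 10{,}000$ runs over 2000 realizations and are meant only to show the bookkeeping.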

By altering the frustration $F$ we can tune our system between a ferromagnet 
($F = 0$) and a spin glass ($\langle F \rangle_{av} = 0.5$) and investigate the relation 
between $F$ and the occurrence of protein-like behavior. Since the native state 
of a protein is unique and commonly assumed to be its ground state, we define a 
realization $\{J_{lm}\}$ as protein-like if it has a single ground state. The number of 
protein-like realizations among the 2000 realizations is denoted by $N_{SG}$. We display 
the frequency $f_{SG} = N_{SG}/2000$ of such realizations as a function of $F$ in Fig. 1, 
which shows that $f_{SG}$ decreases with growing $F$ and is almost constant for $F > 0.44$. 
The inset of Fig. 1 shows the same quantity as a function of $R$; here the flattening 
occurs for $R > 0.23$. Hence, the probability of finding protein-like realizations 
decreases as a function of $F$ (or $R$). However, the total number of realizations is 
given by $N_{Realizations} = (3N)!/[(3N(1-R))!\,(3NR)!]$, i.e., it grows much faster with 
increasing $R$. It follows that the total number of protein-like realizations which can 
be designed (a randomly chosen realization has a vanishingly small probability of having 
a single ground state!) is also an increasing function of $F$, since the bond randomness 
$R$ and the frustration averaged over realizations, $\langle F \rangle_{av}$, are related 
through $\langle F(R) \rangle_{av} = 4((1-R)^3 R + (1-R) R^3)$. 
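
Both closed-form expressions above are easy to evaluate. The sketch below uses illustrative function names. It reads the average-frustration formula as the probability that a plaquette carries an odd number of negative bonds when each of its four bonds is negative independently with probability $R$. That is an approximation here, since the model fixes the number $M$ of negative bonds exactly.

```python
from math import comb

def n_realizations(N, M):
    """Number of ways to place M negative bonds on the 3N bonds:
    (3N)! / [(3N - M)! M!]."""
    return comb(3 * N, M)

def mean_frustration(R):
    """Average frustration 4[(1-R)^3 R + (1-R) R^3]: one or three of the
    four bonds of a plaquette are negative (independent-bond assumption)."""
    return 4 * ((1 - R) ** 3 * R + (1 - R) * R ** 3)

if __name__ == "__main__":
    N = 64  # spins on the 4 x 4 x 4 lattice
    for M in (0, 16, 32, 48, 96):
        R = M / (3 * N)
        print(f"R = {R:.3f}  <F> = {mean_frustration(R):.3f}  "
              f"realizations = {n_realizations(N, M):.3e}")
```

The count of realizations grows combinatorially with $M$ up to $R = 0.5$, while $\langle F \rangle_{av}$ rises from 0 for the ferromagnet to 0.5 at $R = 0.5$.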
We know that with growing $F$ the energy landscape becomes more and more 
rugged. The number of local minima separated by high energy barriers will grow, 
and the probability will increase that our simulated annealing runs get trapped in 
one of them and do not find the global minimum. This can be seen in Fig. 2, where 
we display the average failure rate $\langle N_F \rangle$ as a function of $F$ for all 2000 
samples and for only those realizations with a single ground state ($N_g = 1$). In this 
plot we observe a steep increase of $\langle N_F \rangle$ at $F_g = 0.44 \pm 0.02$ for the 
curve corresponding to the "all samples" case. Note that this value corresponds to 
$R_g = 0.23 \pm 0.02$, which is consistent with the value for the transition between 
ferromagnetic and spin-glass order found in [12]. The transition between the ferromagnet 
and the spin glass can also be observed in the average number of ground states per 
realization, $\langle N_g \rangle$, as a function of $F$, which we display in the inset of 
Fig. 2. The location of the steep increase in this quantity, $F_g = 0.44 \pm 0.02$ (which 
corresponds to $R_g = 0.23 \pm 0.02$), is the same as for the failure rate and agrees 
with the transition point in [12]. 



The failure rate $N_F$ in Fig. 2 measures how often a simulation did not find the 
ground state and is therefore related to the "folding time", i.e., the time which would 
be necessary to find the ground state in a simulation. The "folding time" itself is 
a measure of the kinetic accessibility of the ground states. For the frustrated Ising 
model we see from Fig. 2 that the failure rate (and consequently the "folding time") 
is small for small $F$ and differs little from that of the ferromagnet ($F = 0$). 
This changes once we reach values of $F$ where the system behaves as a spin glass. At 
that point the failure rate and the "folding time" increase by orders of magnitude, 
and even for realizations $\{J_{lm}\}$ which have a single ground state, that state may no 
longer be kinetically accessible. Such a situation is not desirable for real proteins, 
which have only limited time to fold and therefore must have kinetically accessible 
native states. Hence, we cannot assume that realizations $\{J_{lm}\}$ with 
$F > 0.44 \pm 0.02$ are protein-like even if they have a unique ground state. If the 
analogy between proteins and spin systems holds, then we can expect for proteins also an 
interplay between the increasing entropy of sequences which lead to a unique ground state 
structure and the requirement that this state be kinetically accessible. On the 
one hand, the entropy of sequences increases with frustration; on the other hand, 
the folding times become prohibitively large once the frustration exceeds a certain 
value. In the Ising model the transition to this spin glass behavior is pronounced 
and located at $F_g = 0.44 \pm 0.02$ ($R_g = 0.23 \pm 0.02$). The above conjecture may 
explain why proteins are marginally stable: the entropy of marginally stable proteins 
is much higher than that of sequences optimized for thermodynamic stability and fast 
folding. However, a minimal amount of thermodynamic stability is necessary 
to guarantee the function of the protein. 

The above conjecture implies that the "optimal" amount of frustration in proteins 
is where the system is "almost" at the point of becoming a spin glass. This is because 
in such a case the entropy of sequences which lead to a single and accessible ground 
state is maximal. However, a protein should also be stable in the sense that a mutation 
will not lead to an amino acid sequence with a different native structure or no unique 
ground state at all. Hence, those protein structures are preferred which can be realized 
by a maximal number of different amino acid sequences [13]. In the language 
of our spin system the above statement implies that those spin configurations are 
most protein-like which are the single ground state for the largest number of 
realizations $\{J_{lm}\}$. For this reason, we have further checked the $N_{SG}$ 
protein-like ground state configurations for translational and rotational symmetries. 
This procedure leads to a much smaller number $N_d$ of distinct single ground state 
configurations. $N_d$ is displayed as a function of $F$ in the inset of Fig. 3. $N_d$ is 
an increasing function over the whole ferromagnetic range and more or less constant in 
the spin glass range. Hence, with increasing value of $F$ not only the total number of 
protein-like realizations grows but also the variety of protein-like states. 
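
One way to count configurations that are distinct up to translations and rotations is to map each configuration to a canonical representative of its symmetry orbit and count the representatives. The sketch below is an assumed procedure, not the authors' code. It uses the 24 proper rotations of the cube combined with all periodic translations, and does not identify configurations related by a global spin flip or by mirror reflections.

```python
import itertools
import numpy as np

def perm_sign(p):
    """Sign (+1 or -1) of a permutation given as a tuple."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign

def rotations(s):
    """Yield the 24 proper rotations of a cubic array: axis permutations
    combined with axis flips such that the overall determinant is +1."""
    for perm in itertools.permutations(range(3)):
        t = np.transpose(s, perm)
        for flips in itertools.product([False, True], repeat=3):
            if perm_sign(perm) * (-1) ** sum(flips) == 1:
                axes = tuple(a for a, f in enumerate(flips) if f)
                yield np.flip(t, axis=axes) if axes else t

def canonical(s):
    """Lexicographically smallest byte string over all rotated and
    periodically translated images of the configuration s."""
    s = np.asarray(s, dtype=np.int8)
    L = s.shape[0]
    best = None
    for r in rotations(s):
        for shift in itertools.product(range(L), repeat=3):
            key = np.roll(r, shift, axis=(0, 1, 2)).tobytes()
            if best is None or key < best:
                best = key
    return best

def count_distinct(configs):
    """Number of symmetry-inequivalent configurations in configs."""
    return len({canonical(s) for s in configs})
```

For a $4 \times 4 \times 4$ lattice each configuration has $24 \times 64 = 1536$ images, so the canonical form is cheap to compute and can serve directly as a dictionary key when tallying how many realizations share the same ground state.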

From the inset of Fig. 3 we would expect that the situation in proteins 
corresponds to small values of frustration $F$ in the Ising model, where one single 
ground state configuration dominates, which can be realized by many sets of bond 
variables $\{J_{lm}\}$. However, proteins have to change over the course of evolution. The 
requirement of evolutionary flexibility suggests that larger values of randomness and 
frustration should be preferred, since they increase the number of distinct ground state 
structures and enhance the chance that a mutation will lead from one structure to 
a different one. Hence, we expect for proteins an interplay between the requirement 
that the native structure is stable under mutations and the need for structural 
changes over the course of evolution. 

In order to study this interplay we plot in Fig. 3 the ratio $N_d/N_{SG}$. Note that this 
ratio corresponds to the inverse of the (averaged) "designability" [13] and is a measure 
of the degeneracy of the various protein-like states (i.e., spin configurations which are 
unique ground states for some realizations $\{J_{lm}\}$) of our spin system. We see that 
this ratio has a step-like behavior at $F_p = 0.41 \pm 0.02$ (which corresponds to 
$R_p = 0.17 \pm 0.02$). For smaller values of $F$ the $N_d$ types of ground state 
configurations are realized by many sets $\{J_{lm}\}$, while for larger values of $F$ each 
spin configuration is realized by only one realization $\{J_{lm}\}$. Hence, we conclude 
that in our spin system the "optimal" frustration is at $F_p$, where a variety of 
different protein-like configurations can be realized, but at the same time these 
structures can be designed by more than one set $\{J_{lm}\}$ and are therefore stable 
under mutations [14]. Note that this point is close to, but smaller than, the glass 
transition point ($F_g = 0.44 \pm 0.02$). Our value of $F_p$ also corresponds to the point 
where, in Fig. 2, the failure rate of realizations with a single ground state diverges 
from the corresponding curve for all realizations: $F = 0.41 \pm 0.03$. 

The above results suggest that in protein-like systems randomness and frustration 
are necessary to increase the designability of proteins. In our spin system, the absolute 
number of realizations with a single ground state will increase with frustration. On 
the other hand, once the frustration exceeds a certain value, the system becomes 
a spin glass. The resulting rugged energy landscape then implies that the single 
ground state, if it exists, is no longer kinetically accessible. This would be 
biologically undesirable, and the frustration in proteins has to be below this critical 
value. In a similar fashion, the evolutionarily favorable increase in diversity of 
protein-like states with frustration is counteracted by the growing probability that a 
given configuration becomes unstable under mutations. If the frustration exceeds a certain 
value, any mutation would lead to a different structure, which is again biologically 
undesirable. We conjecture that proteins are not minimally frustrated but that in 
protein-like systems the competition between these factors leads to a maximal value 
of $F$ where the number of different kinetically accessible structures, which can be 
realized as single ground states by many sequences, is largest. For our spin system 
this point is $F_p = 0.41 \pm 0.02$, which is close to, but below, the point 
$F_g = 0.44 \pm 0.02$ where the system starts to behave as a spin glass. 

In order to demonstrate how the interplay of the factors outlined above may 
lead to an optimal value of $F$, we have made up the following game. Our starting 
point is the ferromagnet, i.e., $J_{lm} = 1$. The game consists of a series of Monte 
Carlo steps which simulate "evolution". At each Monte Carlo step our system has 
two offspring before it dies. One of the offspring is a copy of the parent; the other 
carries a mutation. We simulate mutations by choosing at random one bond variable 
$J_{lm}$ and switching its sign. Only one of the offspring is allowed to survive, and the 
survival rate of the "mutant" is given by $P(F_n)/(P(F_n) + P(F_o))$. Here, $F_n$ and 
$F_o$ are the frustration of the "mutant" and the "unchanged system", respectively, with 
$P(F) = f_{SG}(F)\,(1 - \langle N_F(F) \rangle)\,(1 - N_d(F)/N_{SG}(F))$, where $f_{SG}(F)$, 
$\langle N_F(F) \rangle$, and $N_d/N_{SG}$ are taken from our previous simulations and 
$\langle N_F(F) \rangle$ corresponds to the curve with $N_g = 1$ in Fig. 2. With these rules 
our system performs a random walk in $F$, shown in Fig. 4. The average value of $F$ 
throughout this random walk is $F = 0.42 \pm 0.03$, which is consistent with 
$F_p = 0.41 \pm 0.02$ and supports our assumption that the evolution of protein-like 
systems leads to an optimal value of $F$. 
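
The rules of this game can be sketched as a short loop. The weight function $P(F)$ depends on curves measured in the simulations, which are not reproduced here, so the sketch takes it as a user-supplied callable `weight`. The weights in the usage example are hypothetical stand-ins with no physical meaning, and the bond layout `J[d, x, y, z]` and all names are assumptions.

```python
import numpy as np

def frustration(J):
    """Fraction of elementary plaquettes whose bond product is -1;
    J[d, x, y, z] is the bond from site (x, y, z) in the +d direction."""
    n_frustrated = 0
    for a in range(3):
        for b in range(a + 1, 3):
            prod = (J[a] * np.roll(J[b], -1, axis=a)
                    * np.roll(J[a], -1, axis=b) * J[b])
            n_frustrated += int(np.sum(prod == -1))
    return n_frustrated / (3 * J[0].size)

def evolve(J, weight, n_steps, rng):
    """Random walk in F: at each step one randomly chosen bond is flipped
    (the mutant), and the mutant replaces the parent with probability
    weight(F_mutant) / (weight(F_mutant) + weight(F_parent))."""
    J = J.copy()
    f_parent = frustration(J)
    series = [f_parent]
    for _ in range(n_steps):
        idx = tuple(int(rng.integers(0, n)) for n in J.shape)
        J[idx] = -J[idx]                      # mutate one bond
        f_mutant = frustration(J)
        w_mutant, w_parent = weight(f_mutant), weight(f_parent)
        total = w_mutant + w_parent
        p_mutant = w_mutant / total if total > 0 else 0.0
        if rng.random() < p_mutant:
            f_parent = f_mutant               # mutant survives
        else:
            J[idx] = -J[idx]                  # parent copy survives: revert
        series.append(f_parent)
    return np.array(series), J

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ferromagnet = np.ones((3, 4, 4, 4), dtype=int)
    # Hypothetical weight, for illustration only: NOT the measured P(F).
    series, _ = evolve(ferromagnet, lambda F: 1.0, 1000, rng)
    print("mean F over the walk:", series.mean())
```

With a constant weight the walk is unbiased and simply drifts towards the spin glass. Reproducing the behavior described in the text would require supplying the measured $f_{SG}(F)$, $\langle N_F(F) \rangle$, and $N_d(F)/N_{SG}(F)$ curves through `weight`.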
In summary, we have studied the simple frustrated Ising model as an analog for 
proteins. Investigating this system as a function of frustration, we found that the spin 
system exhibits protein-like behavior at or slightly below the point at which a system 
changes from an ordered (ferromagnet) to a random system (spin glass). Whether this 
observation (which questions the common belief that proteins are minimally frustrated 
systems) holds for realistic protein models remains to be investigated. As a next 
step in this direction we have started simulations of a bond-diluted and site-diluted 
frustrated Ising model. In such a model, it may be possible to generate more realistic 
protein-like structures with backbone and side chains. 
FIGURES 

FIG. 1. The frequency $f_{SG} = N_{SG}/N_t$ of realizations with a single ground state as a 
function of $F$ and (inset) $R$. 

FIG. 2. The average failure rate $\langle N_F \rangle$ as a function of $F$. In the inset we 
display the average number $\langle N_g \rangle$ of ground states as a function of $F$. 

FIG. 3. The ratio $N_d/N_{SG}$ as a function of $F$. In the inset we show the number $N_d$ of 
truly different single ground state configurations as a function of $F$. 

FIG. 4. Time series of the frustration $F$ from the dynamic simulation described in the 
text. 
Fig. 1 Lin, Hu, and Hansmann 